AITopics | noise-contrastive estimation

Collaborating Authors

noise-contrastive estimation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Provable benefits of annealing for estimating normalizing constants Anonymous Author(s) Affiliation Address email

Neural Information Processing SystemsFeb-15-2026, 20:43:42 GMT

In fact, in a particular limit, the optimal path is arithmetic.

artificial intelligence, estimation error, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

UnsupervisedJointk-nodeGraphRepresentations withCompositionalEnergy-BasedModels

Neural Information Processing SystemsFeb-10-2026, 09:29:58 GMT

We propose MHM-GNN, an inductive unsupervised graph representation approach that combines jointk-node representations with energy-based models (hypergraph Markov networks) and GNNs.

artificial intelligence, machine learning, representation, (18 more...)

Neural Information Processing Systems

Country:

South America > Brazil (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Provable benefits of annealing for estimating normalizing constants Anonymous Author(s) Affiliation Address email

Neural Information Processing SystemsOct-9-2025, 01:22:50 GMT

In fact, in a particular limit, the optimal path is arithmetic.

artificial intelligence, estimation error, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

90080022263cddafddd4a0726f1fb186-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 01:22:46 GMT

Recent research has developed several Monte Carlo methods for estimating the normalization constant (partition function) based on the idea of annealing.

artificial intelligence, estimation error, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > France (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback

Learning word embeddings efficiently with noise-contrastive estimation

Neural Information Processing SystemsSep-30-2025, 12:57:12 GMT

Continuous-valued word embeddings learned by neural language models have recently been shown to capture semantic and syntactic information about words very well, setting performance records on several word similarity tasks. The best results are obtained by learning high-dimensional embeddings from very large quantities of data, which makes scalability of the training method a critical factor. We propose a simple and scalable new approach to learning word embeddings based on training log-bilinear models with noise-contrastive estimation. Our approach is simpler, faster, and produces better results than the current state-of-the art method of Mikolov et al. (2013a). We achieve results comparable to the best ones reported, which were obtained on a cluster, using four times less data and more than an order of magnitude less computing time. We also investigate several model types and find that the embeddings learned by the simpler models perform at least as well as those learned by the more complex ones.

learning word, name change, noise-contrastive estimation, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Add feedback

Review for NeurIPS paper: Noise-Contrastive Estimation for Multivariate Point Processes

Neural Information Processing SystemsJan-23-2025, 11:01:19 GMT

The paper derives a new estimation method for multi-variate point processes that is based on the'ranking'-variant of NCE. The paper is borderline: two reviewers think that the difference to previous work by Gao (who use NCE to estimate point-processes) and the empirical comparison is not sufficient. Two other reviewers disagree, with one in particular arguing that the paper should be accepted. The meta-reviewer thinks that the theory in the paper is sufficiently different from Gao's work, and that the theoretical aspects of the paper are deeper and more rigorous. The results do not follow directly from previous work by Gutmann & Hyvarinen (2012) or Ma & Collins (2018). The empirical results are good and the method should be useful in practice.

multivariate point process, neurips paper, noise-contrastive estimation, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.56)

Add feedback

Noise-Contrastive Estimation for Multivariate Point Processes

Neural Information Processing SystemsOct-9-2024, 23:28:29 GMT

The log-likelihood of a generative model often involves both positive and negative terms. As a result, maximum likelihood estimation is expensive. We show how to instead apply a version of noise-contrastive estimation---a general parameter estimation method with a less expensive stochastic objective. Our specific instantiation of this general idea works out in an interestingly non-trivial way and has provable guarantees for its optimality, consistency and efficiency. On several synthetic and real-world datasets, our method shows benefits: for the model to achieve the same level of log-likelihood on held-out data, our method needs considerably fewer function evaluations and less wall-clock time.

multivariate point process, noise-contrastive estimation

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Learning word embeddings efficiently with noise-contrastive estimation

Neural Information Processing SystemsMar-13-2024, 21:23:05 GMT

Continuous-valued word embeddings learned by neural language models have recently been shown to capture semantic and syntactic information about words very well, setting performance records on several word similarity tasks. The best results are obtained by learning high-dimensional embeddings from very large quantities of data, which makes scalability of the training method a critical factor. We propose a simple and scalable new approach to learning word embeddings based on training log-bilinear models with noise-contrastive estimation. Our approach is simpler, faster, and produces better results than the current state-of-theart method. We achieve results comparable to the best ones reported, which were obtained on a cluster, using four times less data and more than an order of magnitude less computing time. We also investigate several model types and find that the embeddings learned by the simpler models perform at least as well as those learned by the more complex ones.

language model, representation, word representation, (15 more...)

Neural Information Processing Systems

Country: Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Provable benefits of annealing for estimating normalizing constants: Importance Sampling, Noise-Contrastive Estimation, and beyond

Chehab, Omar, Hyvarinen, Aapo, Risteski, Andrej

arXiv.org Machine LearningOct-9-2023

Recent research has developed several Monte Carlo methods for estimating the normalization constant (partition function) based on the idea of annealing. This means sampling successively from a path of distributions that interpolate between a tractable "proposal" distribution and the unnormalized "target" distribution. Prominent estimators in this family include annealed importance sampling and annealed noise-contrastive estimation (NCE). Such methods hinge on a number of design choices: which estimator to use, which path of distributions to use and whether to use a path at all; so far, there is no definitive theory on which choices are efficient. Here, we evaluate each design choice by the asymptotic estimation error it produces. First, we show that using NCE is more efficient than the importance sampling estimator, but in the limit of infinitesimal path steps, the difference vanishes. Second, we find that using the geometric path brings down the estimation error from an exponential to a polynomial function of the parameter distance between the target and proposal distributions. Third, we find that the arithmetic path, while rarely used, can offer optimality properties over the universally-used geometric path. In fact, in a particular limit, the optimal path is arithmetic. Based on this theory, we finally propose a two-step estimator to approximate the optimal path in an efficient way.

artificial intelligence, estimation error, machine learning, (14 more...)

arXiv.org Machine Learning

2310.03902

Country: